
Ranking is a fundamental and popular problem in search. However, existing ranking algorithms usually restrict the granularity of ranking to full passages, or require a separate dense index for each desired level of granularity. This lack of flexibility in granularity negatively affects many applications that can benefit from more granular ranking, such as sentence-level ranking for open-domain question answering or proposition-level ranking for attribution. In this work, we introduce the idea of any-granularity ranking, which leverages multi-vector approaches to rank at varying levels of granularity while maintaining encoding at a single (coarser) level of granularity. We propose a multi-granular contrastive loss for training multi-vector approaches, and validate its utility with both sentences and propositions as ranking units. Finally, we demonstrate the application of proposition-level ranking to post-hoc citation addition in retrieval-augmented generation, surpassing the performance of prompt-driven citation generation.
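The multi-granular contrastive loss is described only at a high level above; the sketch below illustrates one way such a loss could be structured for a late-interaction (multi-vector) ranker. It is a minimal, hypothetical example: the ColBERT-style max-sim scoring, the tensor shapes, the span format, and the alpha mixing weight are illustrative assumptions, not the paper's implementation.

    # Hypothetical sketch of a multi-granular contrastive loss for a
    # multi-vector (late-interaction) ranker. All names and shapes here are
    # assumptions made for illustration.
    import torch
    import torch.nn.functional as F


    def maxsim_score(q_vecs: torch.Tensor, d_vecs: torch.Tensor) -> torch.Tensor:
        """Late-interaction score: for each query vector, take the max
        similarity over the document-side vectors, then sum.
        q_vecs: (Q, H), d_vecs: (D, H). Returns a scalar tensor."""
        sim = q_vecs @ d_vecs.T              # (Q, D) pairwise similarities
        return sim.max(dim=-1).values.sum()


    def multi_granular_loss(
        q_vecs: torch.Tensor,            # (Q, H) query token vectors
        pos_vecs: torch.Tensor,          # (Dp, H) positive passage token vectors
        neg_vecs: list[torch.Tensor],    # list of (Dn, H) negative passage vectors
        spans: list[tuple[int, int]],    # token spans of finer units (e.g. sentences)
        gold_span: int,                  # index of the span relevant to the query
        alpha: float = 0.5,              # weight between coarse and fine losses
    ) -> torch.Tensor:
        # Coarse (passage-level) contrastive loss: positive passage vs. negatives.
        passage_scores = torch.stack(
            [maxsim_score(q_vecs, pos_vecs)]
            + [maxsim_score(q_vecs, n) for n in neg_vecs]
        ).unsqueeze(0)                                   # (1, 1 + num_neg)
        coarse = F.cross_entropy(passage_scores, torch.zeros(1, dtype=torch.long))

        # Fine (sub-passage) contrastive loss: the gold span vs. the other spans
        # of the same passage, scored from the *same* passage encoding, so no
        # separate index per granularity is needed.
        span_scores = torch.stack(
            [maxsim_score(q_vecs, pos_vecs[s:e]) for s, e in spans]
        ).unsqueeze(0)                                   # (1, num_spans)
        fine = F.cross_entropy(span_scores, torch.tensor([gold_span]))

        return alpha * coarse + (1.0 - alpha) * fine


    # Example usage with random vectors (H = 128):
    # q = torch.randn(8, 128); p = torch.randn(40, 128)
    # negs = [torch.randn(40, 128) for _ in range(3)]
    # loss = multi_granular_loss(q, p, negs,
    #                            spans=[(0, 12), (12, 25), (25, 40)], gold_span=1)

The key property mirrored in this sketch is that the finer-grained (sub-passage) term reuses the single coarse passage encoding, which is what allows ranking at varying granularities without a dedicated index per granularity.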

Related readings and updates.

Revealing the Utilized Rank of Subspaces of Learning in Neural Networks

This paper has been accepted at the Efficient Systems for Foundation Models workshop at ICML 2024. In this work, we study how well the learned weights of a neural network utilize the space available to them. This notion is related to capacity, but additionally incorporates the interaction of the network architecture with the dataset. Most learned weights appear to be full rank, and are therefore not amenable to low rank decomposition. This…

Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks

This paper was accepted at the workshop on Overparameterization: Pitfalls and Opportunities at the ICML 2021 conference. Deep linear networks trained with gradient descent yield low rank solutions, as is typically studied in matrix factorization. In this paper, we take a step further and analyze implicit rank regularization in autoencoders. We show greedy learning of low-rank latent codes induced by a linear sub-network at the autoencoder…